Exploiting Data Locality on Scalable

Authors

  • Siegfried Benkner
  • Thomas Brandes
Abstract

OpenMP offers a high-level interface for parallel programming on scalable shared memory (SMP) architectures, providing the user with simple work-sharing directives while relying on the compiler to generate parallel programs based on thread parallelism. However, the lack of language features for exploiting data locality often results in poor performance, since the non-uniform memory access times on scalable SMP machines cannot be neglected. HPF, the de facto standard for data parallel programming, offers a rich set of data distribution directives to exploit data locality, but has mainly been targeted at distributed memory machines. In this paper we describe an optimized execution model for HPF programs on SMP machines that avails itself of the mechanisms provided by OpenMP for work sharing and thread parallelism while exploiting data locality based on user-specified distribution directives. This execution model has been implemented in the ADAPTOR HPF compilation system, and experimental results verify the efficiency of the chosen approach.
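The idea of driving work sharing from a data distribution can be illustrated with a minimal sketch. It assumes an HPF-style BLOCK distribution of a one-dimensional array over threads, where each thread executes only the iterations whose data it owns. The function names (`block_bounds`, `owner`, `schedule`) are hypothetical and chosen for illustration; this is not the ADAPTOR implementation, which the abstract does not detail.

```python
def block_size(n, nthreads):
    """Block length for a BLOCK distribution: ceil(n / nthreads)."""
    return -(-n // nthreads)  # ceiling division on non-negative integers


def block_bounds(n, nthreads, tid):
    """Half-open index range [lo, hi) owned by thread `tid` (hypothetical helper)."""
    size = block_size(n, nthreads)
    lo = min(tid * size, n)
    hi = min(lo + size, n)
    return lo, hi


def owner(i, n, nthreads):
    """Thread that owns element i under the BLOCK distribution."""
    return i // block_size(n, nthreads)


def schedule(n, nthreads):
    """Map each thread to the loop iterations it executes (owner computes)."""
    return {tid: range(*block_bounds(n, nthreads, tid)) for tid in range(nthreads)}


if __name__ == "__main__":
    # Each thread touches only its own block, so the same thread revisits
    # the same memory in every parallel loop over the array.
    for tid, iters in schedule(10, 4).items():
        print(tid, list(iters))
```

Because the iteration-to-thread mapping follows the data distribution rather than a generic loop schedule, repeated loops over the same array keep each thread on the memory it touched before, which is the locality effect the abstract refers to.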


Similar resources

A Scalable Inline Cluster Deduplication Framework for Big Data Protection

Cluster deduplication has become a widely deployed technology in data protection services for Big Data to satisfy the requirements of service level agreement (SLA). However, it remains a great challenge for cluster deduplication to strike a sensible tradeoff between the conflicting goals of scalable deduplication throughput and high duplicate elimination ratio in cluster systems with low-end in...


A Case for Fine-Grain Adaptive Cache Coherence

As transistor density continues to grow geometrically, processor manufacturers are already able to place a hundred cores on a chip (e.g., Tilera TILE-Gx 100), with massive multicore chips on the horizon. Programmers now need to invest more effort in designing software capable of exploiting multicore parallelism. The shared memory paradigm provides a convenient layer of abstraction to the progra...


Issues in Designing Scalable Systems with k-ary n-cube cluster-c Organization

This paper emphasizes design issues in building scalable systems with the upcoming trend of k-ary n-cube cluster-c interconnection. Such an organization takes advantage of VLSI integration and advancements in packaging technologies to interconnect multiple processors into a chip or a board in a cost-effective manner. While the cluster organization helps exploit locality, the clusters are connect...


Exploiting Location Awareness for Scalable Location-Independent Object IDs

We are building a wide-area location service that tracks the current location of mobile and replicated objects. The location service should support up to 10^12 objects on a worldwide scale. To support this huge number of objects, the workload of the location service is distributed over multiple nodes. Our load distribution method is unique in that it is aware of the (geographical) location of no...




Publication date: 2000